Conformalizing an LSTM to Optimize Revenue for a Renewable Energy Operator using Conditional Value at Risk

Energy Markets and Data Analytics | Rutgers, Spring 2024 | Dr. Robert Mieth

Authors

Daniel Moore, Laila Saleh

Published

May 8, 2024

Abstract
We optimized the market trading of a renewable energy generation operator using conditional value at risk (CVaR), based on probabilistic forecasts from a conformalized Long Short-Term Memory (LSTM) recurrent neural network. This work demonstrates an end-to-end workflow in which field data are ingested, analyzed, and exploited to reduce the risk exposure of the operator. The reduction is financially beneficial to the individual operator; applied at scale, the methodology increases the incentive for renewable generators to participate in the market, which should drive down both costs and emissions. This work used only eight features with favorable results, so we expect that further studies with more advanced models built on the same architecture would improve on them.

1 Introduction

The IEEE Hybrid Energy Forecasting and Trading Competition challenges participants to make day-ahead, half-hourly probabilistic forecasts of solar and wind energy production for a solar farm and the Hornsea-1 Wind Farm in the east of England, which have a combined capacity of 3.6 GW. The second task is to decide how much energy to commit to selling at the day-ahead price (DAP) in order to maximize revenue. Any difference between the committed energy and the actual energy is rewarded or penalized at the single settlement price (SSP). As discussed in the Data Analysis section, the DAP and SSP are volatile and can also be negative, which indicates a surplus of energy on the market. In rare cases, underproduction is rewarded because the SSP is negative. The implied task is therefore to forecast the market prices as well, so that the operator can reduce its risk exposure to both energy production and market prices.
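The settlement rule described above can be illustrated with a minimal sketch. It assumes that revenue is linear in the imbalance (committed energy paid at the DAP, the difference paid at the SSP); the official competition formula may include additional terms. The function name `settlement_revenue` is hypothetical, and the example values are the TotalEnergy, DAP, and SSP from the first row of the RebaseAPI sample table in Section 2.1.

```julia
# Simplified settlement (assumption): the committed energy is sold at the DAP
# and the imbalance (actual - committed) is settled at the SSP.
settlement_revenue(committed, actual, dap, ssp) =
    committed * dap + (actual - committed) * ssp

# First row of the RebaseAPI sample: TotalEnergy, DAP, SSP
actual, dap, ssp = 587.3, 54.33, 44.0

# Revenue for under-committing, committing exactly, and over-committing
commitments = (500.0, 587.3, 650.0)
revenues = [settlement_revenue(c, actual, dap, ssp) for c in commitments]

# With a negative SSP, a shortfall is rewarded rather than penalized
shortfall_bonus = settlement_revenue(100.0, 90.0, 50.0, -10.0) - 100.0 * 50.0
```

Under this simplification, whether over- or under-committing pays off depends only on the sign of DAP minus SSP, which is why the price forecasts matter as much as the production forecasts.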

1.1 Motivation

This is an appropriate capstone project for this course because it applies many of the topics covered, ranging from unit commitment and energy market trading to advanced predictive and prescriptive analytics for complex and uncertain events. It is an interesting and practical opportunity to work with the available resources to make the best decisions for the operator. Lastly, we find it a compelling problem because reducing the risk for renewable energy generation operators will encourage more participation and be of net benefit to investors, consumers, and the environment - a rare triple-win.

1.2 Objectives

We will show an end-to-end workflow where we process data to train a forecasting model and conformalize it so that its point forecasts can be transformed into probabilistic forecasts. These forecasts enable us to make market-trading decisions that consider the uncertainty in not only energy production but also in the market itself. Finally, we will demonstrate the financial benefit of leveraging the power of stochastic optimization to reduce the risk exposure of the operator.

1.3 Literature Review

What have other people done

2 Data Analysis

We obtained datasets from two sources: the competition itself, which provides the energy production and price data through the Rebase API, and the VisualCrossing API, which provides the weather data. The energy data record the solar and wind energy production and the DAP and SSP in half-hourly increments. The weather data are treated as historical observations for the period preceding a given forecast and as a weather forecast over the forecast horizon. If deployed, the model would have to rely on forecasted weather data only. This approach is acceptable for this study because 24-hour-ahead weather forecasts are typically accurate and we only incorporate basic weather features.

2.1 Data Summary

The tables below provide a sample of what the data from each source look like for a few observation times and summary statistics.

Data from RebaseAPI (5×6 DataFrame)

Row  DateTime             Solar    Wind     TotalEnergy  DAP      SSP
     DateTime             Float32  Float32  Float32      Float32  Float32
1    2024-03-05T03:00:00  0.0      587.3    587.3        54.33    44.0
2    2024-03-05T03:30:00  0.0      601.86   601.86       54.33    55.94
3    2024-03-05T04:00:00  0.0      595.92   595.92       61.81    52.82
4    2024-03-05T04:30:00  0.0      575.42   575.42       61.81    52.63
5    2024-03-05T05:00:00  0.0      563.54   563.54       71.01    57.49

Energy Data Summary (5×5 DataFrame)

Row  variable     mean     min      median   max
     Symbol       Float32  Float32  Float64  Float32
1    Solar        276.497  0.0      7.85413  1853.73
2    Wind         543.425  0.0      698.674  826.254
3    TotalEnergy  819.922  0.0      778.34   2367.19
4    DAP          56.0618  -23.77   61.565   112.23
5    SSP          55.2193  -88.0    55.595   177.71

Data from VisualCrossing (5×6 DataFrame)

Row  DateTime             temp     windspeed  winddir  cloudcover  visibility
     DateTime             Float32  Float32    Float32  Float32     Float32
1    2024-03-05T03:00:00  6.9      14.8       132.0    100.0       10.0
2    2024-03-05T03:30:00  6.9      14.8       132.0    100.0       10.0
3    2024-03-05T04:00:00  6.3      13.9       140.0    94.0        8.0
4    2024-03-05T04:30:00  6.3      13.9       140.0    94.0        8.0
5    2024-03-05T05:00:00  6.2      12.5       150.0    100.0       6.0

Weather Data Summary (6×5 DataFrame)

Row  variable    mean     min                  median               max
     Symbol      Union…   Any                  Any                  Any
1    DateTime             2024-02-29T23:00:00  2024-03-27T21:45:00  2024-04-23T21:30:00
2    temp        8.54791  -2.3                 8.0                  19.0
3    windspeed   16.9168  0.9                  16.3                 46.4
4    winddir     191.088  2.0                  200.0                359.0
5    cloudcover  71.4654  0.0                  91.6                 100.0
6    visibility  14.4783  0.0                  14.9                 29.8

2.2 Data Visualizations

The first thing to notice when looking at the DAP and SSP time-series plots below is the volatility of the prices. The DAP exhibits some seasonality but the trend and cycles are not as clear. The SSP is more volatile as it traverses from the daily minimum to the daily maximum several times in a given day. This highlights how difficult it is for all participants to make accurate forecasts and fulfill their commitments.

[Figure: time series of the DAP and SSP. Panels: March & April 2024; Week 3, April 2024. Price axis from −50 to 150.]

March & April 2024 Energy Prices

[Figure: time-series panels for March & April 2024 and Week 3, April 2024. Series labels surviving in the extracted plot: Solar, SSP.]

March & April 2024 Energy Production

The plots of the DAP and SSP for the third week of April above provide a closer look at the characteristics of these prices. While predicting the DAP appears feasible, the SSP is hardly distinguishable from noise.

The figures below show the energy generation over March 2024 and over the first week of April 2024. As expected, solar energy production is seasonal, while wind energy production is strongly cyclical: it appears to go from nothing to full capacity in a short amount of time, stay there for an irregular length of time, and then drop, usually back to nothing.

[Figure: time series of solar and wind energy production. Production axis from 0 to 1500.]

March 2024 Energy Production

[Figure: time series of solar and wind energy production; x-axis ticks from 2024-04-17 to 2024-04-23. Production axis from 0 to 1500.]

First Week of April 2024 Energy Production

[Figure: DAP and SSP with a temperature scale (0 to 17.5 °C). Panels: Monthly DAP; One Week DAP; Monthly SSP; One Week SSP.]

DAP, SSP vs Temperature

[Figure: solar production vs. time of day, with a cloud cover scale (0 to 100 %).]

Daily Solar Production vs Cloud Cover

[Figure: solar production vs. time of day, with a temperature scale (0 to 17.5 °C).]

Daily Solar Production vs Temperature

[Figure: wind production vs. time of day, with a wind speed scale (5 to 45 kph).]

[Figure: wind production vs. time of day, with a temperature scale (0 to 17.5 °C).]

3 Long Short-Term Memory Model

We used an LSTM Recurrent Neural Network (RNN) to predict the energy production and market prices for the following day. RNNs are designed to handle time-series data and LSTMs are a special type of RNN that can learn long-term dependencies in the data. Combined with a dense neural network, this model can remember (and forget) time-dependent relationships and approximate the complex dynamics among the variables.
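For reference, the remember-and-forget behaviour comes from the gates of the LSTM cell. The standard textbook form of the update is given below; the symbols ($W$, $U$, $b$ for weights and biases, $x_t$ for the input, $h_t$ for the hidden state, $c_t$ for the cell state) are conventional notation rather than anything defined in this report, and the internal parameter layout used by a particular library may differ.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) && \text{(input gate)} \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) && \text{(output gate)} \\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh\left(c_t\right)
\end{aligned}
```

The forget gate $f_t$ scales how much of the previous cell state is kept, and the input gate $i_t$ scales how much new information is written, which is what lets the network retain long-term dependencies.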

3.1 Training

We used Julia’s deep learning library, Flux, to build and train the LSTM. We created a multi-target regressor to predict the energy production and market prices in single timestep increments. Target

  • Time: we transform time to be a ha

  • Weather Columns: Temperature

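using Flux        # LSTM, Dense, Chain, Adam, mse
using Statistics  # mean

# Mean squared error between the model output and the target
loss(model, x, y) = Flux.mse(model(x), y)
# Convenience method: each sequence stores its inputs in `past` and its targets in `next`
loss(model, data) = loss(model, data.past, data.next)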
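# Inputs: one time feature plus the weather and energy columns
input_dims = 1 + length(weather_cols) + length(energy_cols)
# Outputs: one prediction per energy column
out_dims = length(energy_cols)
hidden_dim = 2^3
model = Chain(
    LSTM_in=LSTM(input_dims, hidden_dim),
    LSTM_hidden=LSTM(hidden_dim, hidden_dim),
    # σ bounds every output to (0, 1), so the targets are assumed to be scaled to that range
    Dense_out=Dense(hidden_dim, out_dims, σ)
)

# Loss before any training, evaluated on the first sequence of each set
train_loss_log = [loss(model, first(Train))]
test_loss_log = [loss(model, first(Test))]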

model
Chain(
  Recur(
    LSTMCell(11 => 8),                  # 656 parameters
  ),
  Recur(
    LSTMCell(8 => 8),                   # 560 parameters
  ),
  Dense(8 => 5, σ),                     # 45 parameters
)         # Total: 12 trainable arrays, 1_261 parameters,
          # plus 4 non-trainable, 1_024 parameters, summarysize 9.879 KiB.
# Learning rate
η = 1e-4
opt_state = Flux.setup(Adam(η), model)
# Number of epochs
epochs = 2^3

for epoch in 1:epochs

    # Initializing a log for the training loss for this epoch
    temp_train_log = []
    # Resetting the model state
    Flux.reset!(model)
    # Conditioning the hidden state on the first training sequence
    model(first(Train).past)
    # Training on the remaining training sequences
    # (`train!` with a `loss_log` keyword is assumed to be a project-defined helper, not shown here)
    for T in Train[2:end]
        train!(loss, model, T, opt_state, loss_log=temp_train_log)
    end

    # Logging the mean training loss for this epoch
    push!(train_loss_log, mean(temp_train_log))
    # Resetting the state and conditioning on the first test sequence
    Flux.reset!(model)
    model(first(Test).past)
    # Logging the mean test loss for this epoch
    push!(test_loss_log, mean(loss(model, T) for T in Test))
end
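The loop above calls `train!` with a `loss_log` keyword, and that helper is not defined in this report. The following is only a guess at what such a helper might look like: a single gradient step on one sequence that also records the loss. The name `logged_train!` and its signature are hypothetical; `Flux.withgradient`, `Flux.update!`, and `Flux.setup` are part of Flux's explicit-gradient API.

```julia
using Flux

# Hypothetical stand-in for the `train!` helper used in the training loop:
# one optimizer step on a single sequence, with the loss value recorded.
function logged_train!(loss, model, data, opt_state; loss_log=nothing)
    # Loss value and its gradient with respect to the model parameters
    value, grads = Flux.withgradient(m -> loss(m, data), model)
    # Apply the optimizer update in place
    Flux.update!(opt_state, model, grads[1])
    # Record the loss if a log was supplied
    loss_log === nothing || push!(loss_log, value)
    return value
end

# Usage on a toy model (assumed shapes, not the report's data)
toy_model = Dense(2 => 1)
toy_state = Flux.setup(Adam(1e-2), toy_model)
toy_data = (past=rand(Float32, 2, 4), next=rand(Float32, 1, 4))
toy_loss(m, d) = Flux.mse(m(d.past), d.next)
toy_log = Float32[]
logged_train!(toy_loss, toy_model, toy_data, toy_state; loss_log=toy_log)
```

Taking the step one sequence at a time keeps the recurrent state flowing from one sequence to the next, which matches the way the loop conditions the model on the first sequence before training on the rest.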
[Figure: training loss and testing loss (mean squared error) vs. epochs, from 2^0 to 2^3.]

Training and Testing Loss

3.2 Performance

Is it any good?

4 Conformalizing LSTM

What are conformal predictions? Why are they good?
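As a generic illustration only (this is not necessarily the procedure used in this project), split conformal prediction turns a point forecast into an interval by using the absolute residuals on a held-out calibration set. The sketch below assumes exchangeable data, which is a strong assumption for time series; the function names and the calibration values are hypothetical.

```julia
# Split conformal sketch: the radius q is the finite-sample-corrected
# (1 - α) quantile of the absolute calibration residuals.
function conformal_radius(y_cal, yhat_cal; α=0.1)
    scores = sort(abs.(y_cal .- yhat_cal))   # nonconformity scores, ascending
    n = length(scores)
    k = ceil(Int, (n + 1) * (1 - α))         # corrected rank
    # With too few calibration points, no finite radius gives the coverage
    return k > n ? Inf : scores[k]
end

# Interval around a new point forecast
conformal_interval(yhat, q) = (yhat - q, yhat + q)

# Hypothetical calibration data (not from the report)
y_cal    = [10.0, 12.0, 9.0, 11.0, 13.0, 10.5, 9.5, 12.5, 11.5]
yhat_cal = [10.5, 11.0, 9.5, 11.0, 12.0, 10.0, 10.0, 12.0, 11.0]

q = conformal_radius(y_cal, yhat_cal; α=0.2)
lower, upper = conformal_interval(11.0, q)
```

Under exchangeability, the interval covers a new observation with probability at least 1 − α regardless of how good the underlying point forecaster is; a better forecaster simply yields narrower intervals.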

4.1 How to?

How did we conformalize the LSTM?

4.2 Performance

Is it any good?

5 Conditional Value at Risk

What is CVaR?

Why do we do it?
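As a generic illustration only (not necessarily the formulation used in this project), the CVaR of revenue at level β can be estimated from scenarios as the mean of the worst β-fraction of outcomes, and a trading decision can then trade off expected revenue against that tail. The sketch below uses a grid search over candidate commitments and a simplified linear settlement; all function names, the weight λ, and the scenario values are hypothetical.

```julia
using Statistics

# Empirical CVaR of revenue: the mean of the worst β-fraction of scenarios.
function cvar(revenues; β=0.1)
    sorted = sort(revenues)                     # ascending, worst first
    k = max(1, ceil(Int, β * length(sorted)))   # number of tail scenarios
    return mean(sorted[1:k])
end

# Blend of expected revenue and tail revenue; λ = 0 is risk neutral.
risk_adjusted(revenues; β=0.1, λ=0.5) =
    (1 - λ) * mean(revenues) + λ * cvar(revenues; β=β)

# Simplified settlement (assumption): commitment at the DAP, imbalance at the SSP.
revenue(commit, actual, dap, ssp) = commit * dap + (actual - commit) * ssp

# Grid search over candidate commitments, given paired scenarios of
# actual energy and SSP for one settlement period.
function best_commitment(candidates, energy, ssp, dap; β=0.1, λ=0.5)
    scores = [risk_adjusted(revenue.(c, energy, dap, ssp); β=β, λ=λ)
              for c in candidates]
    return candidates[argmax(scores)]
end

# Hypothetical scenarios (not from the report)
energy = [400.0, 500.0, 600.0, 700.0, 800.0]
ssp    = [90.0, 70.0, 55.0, 45.0, 30.0]
commit = best_commitment(0.0:50.0:800.0, energy, ssp, 56.0; β=0.2, λ=0.5)
```

With a linear settlement, the risk-neutral objective (λ = 0) is linear in the commitment and pushes it to an extreme of the grid, whereas the CVaR term makes the objective concave and can favor an interior commitment that protects against the worst scenarios.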

5.1 Implementation

How did we do it?

5.2 Performance

Was it any good?

6 Conclusions

Was any of this worthwhile?

What did we learn?

What could others do?